Feng Zheng

Trans2Occ: Voxel Occupancy Estimation and Grasp for Transparent Objects from Simulation to Reality

Jun 01, 2026
Via arXiv

MM-Snowball: Evaluating and Mitigating Hallucination Snowballing in Multimodal Multi-Turn Dialogue

May 30, 2026
Via arXiv

When Policy Entropy Constraint Fails: Preserving Diversity in Flow-based RLHF via Perceptual Entropy

May 12, 2026
Via arXiv

AnomalyClaw: A Universal Visual Anomaly Detection Agent via Tool-Grounded Refutation

May 11, 2026
Via arXiv

LiveVLN: Breaking the Stop-and-Go Loop in Vision-Language Navigation

Apr 21, 2026
Via arXiv

A1: A Fully Transparent Open-Source, Adaptive and Efficient Truncated Vision-Language-Action Model

Apr 07, 2026
Via arXiv

Structured Causal Video Reasoning via Multi-Objective Alignment

Apr 06, 2026
Via arXiv

Scalable Object Relation Encoding for Better 3D Spatial Reasoning in Large Language Models

Mar 25, 2026
Via arXiv

Learning Trajectory-Aware Multimodal Large Language Models for Video Reasoning Segmentation

Mar 23, 2026
Via arXiv

Show Me When and Where: Towards Referring Video Object Segmentation in the Wild

Mar 15, 2026
Via arXiv